Logistic Model Trees with AUC Split Criterion for the KDD Cup 2009 Small Challenge

Authors

  • Patrick Doetsch
  • Christian Buck
  • Pavlo Golik
  • Niklas Hoppe
  • Michael Kramp
  • Johannes Laudenberg
  • Christian Oberdörfer
  • Pascal Steingrube
  • Jens Forster
  • Arne Mauser
Abstract

In this work, we describe our approach to the “Small Challenge” of the KDD Cup 2009: predicting three aspects of customer behavior for a telecommunications service provider. Our most successful method was a Logistic Model Tree with AUC as the split criterion, using predictions from boosted decision stumps as features. This was the best submission for the “Small Challenge” that did not use additional data from other feature sets. A second approach, an AUC-optimized weighted linear combination of several rankings, scored slightly worse, with an average AUC of 0.8074. From the 230 given features, we extracted additional binary features and imputed missing values using SVMs and decision trees.
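To make the two AUC-based ideas in the abstract concrete, here is a minimal Python sketch. It computes AUC as the fraction of correctly ordered (positive, negative) pairs and then picks the weight of a two-ranking linear combination by a coarse grid search. The function names (`auc`, `combine`, `best_weight`), the grid search, and the toy data are illustrative assumptions; the abstract does not say how the authors optimized their weights, and this is not their implementation.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def combine(rankings: Sequence[Sequence[float]],
            weights: Sequence[float]) -> List[float]:
    """Weighted linear combination of several score vectors."""
    return [sum(w * r[i] for w, r in zip(weights, rankings))
            for i in range(len(rankings[0]))]


def best_weight(a: Sequence[float], b: Sequence[float],
                labels: Sequence[int], steps: int = 10) -> Tuple[float, float]:
    """Grid search (an assumption, not the paper's optimizer) over w in [0, 1]
    for the combination w * a + (1 - w) * b that maximizes AUC."""
    best_w, best_auc = 0.0, -1.0
    for k in range(steps + 1):
        w = k / steps
        score = auc(combine([a, b], [w, 1.0 - w]), labels)
        if score > best_auc:
            best_w, best_auc = w, score
    return best_w, best_auc


# Toy data (hypothetical): two rankings that each reach AUC 0.75 alone.
labels = [1, 0, 1, 0]
ranking_a = [0.9, 0.8, 0.3, 0.1]
ranking_b = [0.2, 0.1, 0.9, 0.3]
w, combined_auc = best_weight(ranking_a, ranking_b, labels)
print(auc(ranking_a, labels), auc(ranking_b, labels), w, combined_auc)
```

On this toy data each ranking misorders one pair on its own, while a suitable mix orders all four pairs correctly, which is the intuition behind combining rankings under an AUC objective. The same `auc` function could serve as a split criterion by scoring each candidate split's predictions, though the tree construction itself is not shown here.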

Similar resources

Accelerating AdaBoost using UCB

This paper explores how multi-armed bandits (MABs) can be applied to accelerate AdaBoost. AdaBoost constructs a strong classifier in a stepwise fashion by adding simple base classifiers to a pool and using their weighted “vote” to determine the final classification. We model this stepwise base classifier selection as a sequential decision problem, and optimize it with MABs. Each arm represents ...

Winning the KDD Cup Orange Challenge with Ensemble Selection

We describe our winning solution for the KDD Cup Orange Challenge.

Predicting customer behaviour: The University of Melbourne's KDD Cup report

We discuss the challenges of the 2009 KDD Cup along with our ideas and methodologies for modelling the problem. The main stages included aggressive nonparametric feature selection, careful treatment of categorical variables and tuning a gradient boosting machine under Bernoulli loss with trees.

The 2009 Knowledge Discovery in Data Competition (KDD Cup 2009) Challenges in Machine Learning

We organized the KDD Cup 2009 around a marketing problem with the goal of identifying data mining techniques capable of rapidly building predictive models and scoring new entries on a large database. Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offered the opportunity to work on large marketing databases from the French Telecom company...

Analysis of the KDD Cup 2009: Fast Scoring on a Large Orange Customer Database

We organized the KDD Cup 2009 around a marketing problem with the goal of identifying data mining techniques capable of rapidly building predictive models and scoring new entries on a large database. Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offered the opportunity to work on large marketing databases from the French Telecom company...

Publication date: 2009